Tutorial 6

Monday, June 24, 2013

12:28 PM

    Hashing

    • Next tutorial cancelled (Canaday)

     

      • Use an array, key=index

     

     

    • What is a good hash function? Non trivial.
    • Depends on input distribution and object type
    • Important 
    • This is impossible by Pigeonhole principle
    • No perfection has function exists

     

    • What if worst case doesn’t matter
    • Good input distribution, worst case is Rare
    • Uniform Hashing Assumption
      • Each hash value is equally likely
      • This is reasonable practice

     

    • Be very careful in how you justify why worst case doesn't matter (Application dependent)
    • Don't use hashing if program is mission critical.
      • Eg
        • User controlled input -> DOS attack

     

    • Load factors
        • Small load -> wasted space
        • Large load lots of collision & poor performance

     

    • Rehash (Rebuild with new hash function & different table size) to keep load reasonable.

     

    Collision Resolution

    • Chaining - make a linked list for each bucket (hashable)
      • Ex

     

     

    Open addressing

    • Define a sequence of spots where a key can go for every key

     

    • Linear probing

    • Deletion: Leave a flag to mark it as previously occupied.

     

    • Double hashing

     

    • Cuckoo hashing

    Insert(k)

     

 

Created with Microsoft OneNote 2010
One place for all your notes and information